Test Collections and Measures for Evaluating Customer-Helpdesk Dialogues

Authors

  • Zhaohao Zeng
  • Cheng Luo
  • Lifeng Shang
  • Hang Li
  • Tetsuya Sakai
Abstract

We address the problem of evaluating textual, task-oriented dialogues between the customer and the helpdesk, such as those that take the form of online chats. As an initial step towards evaluating automatic helpdesk dialogue systems, we have constructed a test collection comprising 3,700 real Customer-Helpdesk multiturn dialogues by mining Weibo, a major Chinese social media platform. We have annotated each dialogue with multiple subjective quality annotations and nugget annotations, where a nugget is a minimal sequence of posts by the same utterer that helps towards problem solving. In addition, 10% of the dialogues have been manually translated into English. We have made our test collection DCH-1 publicly available for research purposes. We also propose a simple nugget-based evaluation measure for task-oriented dialogue evaluation, which we call UCH, and explore its usefulness and limitations.

Similar resources

Multi-domain case-based module for customer support

Technology Management Centres provide technological and customer support services for private or public organisations. Commonly, these centres offer support using a helpdesk software that facilitates the work of their operators. In this paper, a CBR module that acts as a solution recommender for customer support environments is presented. The CBR module is flexible and multi-domain, in order to...

Dynamic active probing of helpdesk databases

Helpdesk databases are used to store past interactions between customers and companies to improve customer service quality. One common scenario of using helpdesk database is to find whether recommendations exist given a new problem from a customer. However, customers often provide incomplete or even inaccurate information. Manually preparing a list of clarification questions does not work for l...

Decision Trees for Helpdesk Advisor Graphs

We use decision trees to build a helpdesk agent reference network to facilitate the on-the-job advising of junior or less experienced staff on how to better address telecommunication customer fault reports. Such reports generate field measurements and remote measurements which, when coupled with location data and client attributes, and fused with organization-level statistics, can produce model...

Towards Automatic Evaluation of Multi-Turn Dialogues: A Task Design that Leverages Inherently Subjective Annotations

This paper proposes a design of a shared task whose ultimate goal is automatic evaluation of multi-turn, dyadic, textual helpdesk dialogues. The proposed task takes the form of an offline evaluation, where participating systems are given a dialogue as input, and output at least one of the following: (1) an estimated distribution of the annotators’ quality ratings for that dialogue; and (2)...

Evaluating Spoken Language Systems

Spoken language systems (SLSs) for accessing information sources or services through the telephone network and the Internet are currently being trialed and deployed for a variety of tasks. Evaluating the usability of different interface designs requires a method for comparing the performance of different versions of the SLS. Recently, Walker et al. (1997) proposed PARADISE (PARAdigm for DIalogue Sys...

Publication date: 2017